KALIE: Fine-Tuning Vision-Language Models for Open-World Manipulation without Robot Data

Tang, Grace, Rajkumar, Swetha, Zhou, Yifei, Walke, Homer Rich, Levine, Sergey, Fang, Kuan

arXiv.org Artificial Intelligence

Building generalist robotic systems involves effectively endowing robots with the capabilities to handle novel objects in an open-world setting. Inspired by the advances of large pre-trained models, we propose Keypoint Affordance Learning from Imagined Environments (KALIE), which adapts pre-trained Vision Language Models (VLMs) for robotic control in a scalable manner. Instead of directly producing motor commands, KALIE controls the robot by predicting point-based affordance representations based on natural language instructions and visual observations of the scene. The VLM is trained on 2D images with affordances labeled by humans, bypassing the need for training data collected on robotic systems. Through an affordance-aware data synthesis pipeline, KALIE automatically creates massive high-quality training data based on limited example data manually collected by humans. We demonstrate that KALIE can learn to robustly solve new manipulation tasks with unseen objects given only 50 example data points. Compared to baselines using pre-trained VLMs, our approach consistently achieves superior performance.


Braced Fourier Continuation and Regression for Anomaly Detection

Sabuda, Josef

arXiv.org Machine Learning

In this work, the concept of Braced Fourier Continuation and Regression (BFCR) is introduced. BFCR is a novel and computationally efficient means of finding nonlinear regressions or trend lines in arbitrary one-dimensional data sets. The Braced Fourier Continuation (BFC) and BFCR algorithms are first outlined, followed by a discussion of the properties of BFCR as well as demonstrations of how BFCR trend lines may be used effectively for anomaly detection both within and at the edges of arbitrary one-dimensional data sets. Finally, potential issues which may arise while using BFCR for anomaly detection as well as possible mitigation techniques are outlined and discussed. All source code and example data sets are either referenced or available via GitHub, and all associated code is written entirely in Python.
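The abstract does not spell out the BFCR algorithm itself, but the general idea it describes (flagging anomalies as large deviations from a fitted trend line) can be sketched in Python. Here an ordinary polynomial fit stands in for a BFCR trend line; the function name, polynomial degree, and 3-sigma threshold are illustrative assumptions, not the paper's method:

```python
import numpy as np

def flag_anomalies(y, degree=3, threshold=3.0):
    """Flag points that deviate strongly from a fitted trend line.

    Fits a simple polynomial trend (a stand-in for a BFCR trend line)
    and marks points whose residual exceeds `threshold` standard
    deviations of the residuals as anomalous.
    """
    x = np.arange(len(y))
    trend = np.polyval(np.polyfit(x, y, degree), x)
    residuals = y - trend
    scores = np.abs(residuals) / residuals.std()
    return scores > threshold

# Example: a smooth signal with one injected spike.
y = np.sin(np.linspace(0, 4 * np.pi, 200))
y[100] += 5.0
mask = flag_anomalies(y)
print(np.flatnonzero(mask))  # the injected spike should be flagged
```

A real BFCR trend line would replace the polynomial fit, but the residual-thresholding step for anomaly detection follows the same pattern.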


ToMBench: Benchmarking Theory of Mind in Large Language Models

Chen, Zhuang, Wu, Jincenzi, Zhou, Jinfeng, Wen, Bosi, Bi, Guanqun, Jiang, Gongyao, Cao, Yaru, Hu, Mengting, Lai, Yunghwei, Xiong, Zexuan, Huang, Minlie

arXiv.org Artificial Intelligence

Theory of Mind (ToM) is the cognitive capability to perceive and ascribe mental states to oneself and others. Recent research has sparked a debate over whether large language models (LLMs) exhibit a form of ToM. However, existing ToM evaluations are hindered by challenges such as constrained scope, subjective judgment, and unintended contamination, yielding inadequate assessments. To address this gap, we introduce ToMBench with three key characteristics: a systematic evaluation framework encompassing 8 tasks and 31 abilities in social cognition, a multiple-choice question format to support automated and unbiased evaluation, and a build-from-scratch bilingual inventory to strictly avoid data leakage. Based on ToMBench, we conduct extensive experiments to evaluate the ToM performance of 10 popular LLMs across tasks and abilities. We find that even the most advanced LLMs like GPT-4 lag behind human performance by more than 10 percentage points, indicating that LLMs have not achieved a human-level theory of mind yet. Our aim with ToMBench is to enable an efficient and effective evaluation of LLMs' ToM capabilities, thereby facilitating the development of LLMs with inherent social intelligence.
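As a rough illustration of the automated evaluation that a multiple-choice format enables, a minimal accuracy scorer might look like the following (the function name and option labels are hypothetical, not ToMBench's actual code):

```python
def score_multiple_choice(predictions, answers):
    """Accuracy for a multiple-choice benchmark: the fraction of
    questions where the model's chosen option matches the gold answer."""
    assert len(predictions) == len(answers)
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

# A model answering 3 of 4 questions correctly scores 0.75.
print(score_multiple_choice(["A", "C", "B", "D"], ["A", "B", "B", "D"]))  # 0.75
```

Because each question has exactly one gold option, no human or model-based judging is needed, which is what makes the evaluation unbiased and automatable.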


The Role of AI in Accelerating Skill Development

#artificialintelligence

The age of artificial intelligence is dawning. Yet alongside its many benefits, AI will displace millions of people around the world from their current workplace roles, especially those in white-collar jobs such as customer service, copywriting, and computer programming. It has already started to do so. AI also presents a wonderful opportunity to rethink how we develop new skills. Those who seize this opportunity will move forward into exciting roles of their choosing, equipped with new skills they learned with the support of AI. To mitigate the negative impacts of AI on our careers, we must evolve our methods of acquiring new skills. In this post, I share my recent experience of interacting with ChatGPT while exploring the impact of permanently closing the United States stock exchanges.


Discrete fully probabilistic design: towards a control pipeline for the synthesis of policies from examples

Ferrentino, Enrico, Chiacchio, Pasquale, Russo, Giovanni

arXiv.org Artificial Intelligence

We present the principled design of a control pipeline for the synthesis of policies from example data. The pipeline, based on a discretized design which we term discrete fully probabilistic design, expounds an algorithm recently introduced in Gagliardi and Russo (2021) to synthesize policies from examples for constrained, stochastic and nonlinear systems. In contrast to other approaches, the pipeline we present: (i) does not need the constraints to be fulfilled in the possibly noisy example data; (ii) enables control synthesis even when the data are collected from an example system that is different from the one under control. The design is benchmarked numerically on an example that involves controlling an inverted pendulum with actuation constraints, starting from data collected from a physically different pendulum that does not satisfy the system-specific actuation constraints. We also make our fully documented code openly available.


Behind the Paper That Led to a Google Researcher's Firing

WIRED

Earlier this year, Google artificial intelligence researcher Timnit Gebru sent a Twitter message to University of Washington professor Emily Bender. Gebru asked Bender if she had written about the ethical questions raised by recent advances in AI that processes text. Bender hadn't, but the pair fell into a conversation about the limitations of such technology, such as evidence it can replicate biased language found online. Bender found the DM discussion enlivening and suggested building it into an academic paper. "I hoped to provoke the next turn in the conversation," Bender says.


The Big Ways Full Self Driving & Machine Learning Differ From Our Brains

#artificialintelligence

In a previous article, I discussed my long-term plan to learn more about machine learning, starting with the Elements of AI courses. While I'm only at the beginning of this journey, what I've learned so far has been very enlightening. It's tempting to see systems like Tesla's Full Self Driving (FSD) beta as a child that is learning by doing while we supervise and keep things safe. Eventually, we think, the "child" will grow up to be like us, and then maybe even become better at driving than humans are. After studying the basics more, it's clear that this isn't what machine learning does.


Importance and Functions of Kernel in Machine Learning - AI Objectives

#artificialintelligence

What is the first thing that comes to mind when you read or hear the word "kernel"? For some, it's a military rank; for a computer scientist, it's the operating system kernel that manages the hardware according to given instructions. The kernel in machine learning is something else, though the analogy is loose: it underpins the function learned by a model, or trainer, from some experience, i.e., from example data points. So what exactly is machine learning? And what are these experiences, examples, and data points? Here we will try to answer all these questions.
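One of the most common kernels in machine learning is the radial basis function (RBF) kernel, which measures the similarity of two data points. A minimal sketch, not tied to any particular library:

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function (Gaussian) kernel: a similarity measure
    between two feature vectors, used by kernel methods such as SVMs.
    Returns 1.0 for identical points and decays toward 0 with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical points -> 1.0
print(rbf_kernel([0.0, 0.0], [3.0, 4.0]))  # distant points -> near 0
```

Kernel methods never need the data in a high-dimensional feature space explicitly; they only ever evaluate similarities like this one, which is what makes the "kernel trick" efficient.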


6 Machine Learning Concepts for Beginners

#artificialintelligence

In machine learning, the inputs that we have talked about above are called features. Features are a set of attributes assigned to a data point. A famous example commonly used for machine learning practice problems is the "Boston housing prices" data set. It consists of a set of features relating to a house, such as its age, average number of rooms, and property tax value, together with a corresponding house price. For a machine learning model to be successful in performing its task, a statistical relationship needs to exist between at least some of these features and the price of the house.
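The feature/label split described above can be sketched in Python. The two rows below are illustrative values in the spirit of the Boston housing data, not an exact excerpt:

```python
# Each data point is a house described by a set of features plus a price label.
houses = [
    {"age": 65.2, "avg_rooms": 6.575, "property_tax": 296.0, "price": 24.0},
    {"age": 78.9, "avg_rooms": 6.421, "property_tax": 242.0, "price": 21.6},
]

feature_names = ["age", "avg_rooms", "property_tax"]
X = [[h[name] for name in feature_names] for h in houses]  # feature matrix
y = [h["price"] for h in houses]                           # target values
print(X[0], y[0])
```

A model is then trained to map each feature vector in `X` to its target value in `y`, which only works if the statistical relationship mentioned above actually exists.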


Big data in IBD: big progress for clinical practice

#artificialintelligence

Precision medicine holds great promise to improve the course of care for an individual patient with IBD, providing the most beneficial therapy while minimising the risk. The ultimate goals of precision medicine include stratifying patients based on disease subtypes and severity, disease progression and treatment response, using personal and clinical data coupled with molecular profiling data of patients [1, 2]. IBD, with its two main subtypes, Crohn's disease (CD) and UC, is a complex inflammatory disease with a wide range of contributing factors including host genetics, the immune system, environmental exposures and the gut microbiome [3–5]. The inherent complexity of the disease introduces a large number of confounding factors, which stand in the way of accurate diagnosis and precision medicine [6]. The term 'big data' generally refers to large volumes of rapidly produced data from variable sources, characterised by the three 'V's (volume, velocity and variety) [7]. Over the past decades, the production and availability of data that could inform healthcare has increased remarkably, mainly due to technological advancements and the falling costs of data generation.